DOCUMENT-IDENTIFIER: US 5712142 A

TITLE: Method for increasing thermostability in cellulase enzymes
Brief Summary Text (6):

The cellulase complex produced by this organism is known to contain several
different cellulase enzymes with maximal activities at temperatures of 75° C.
to 83° C. These cellulases are resistant to inhibition from cellobiose, an end
product of the reactions catalyzed by cellulase. Also, the cellulases from
Acidothermus cellulolyticus are active over a broad pH range centered about pH 6. A
high molecular weight cellulase isolated from growth broths of Acidothermus
cellulolyticus was found to have a molecular weight of approximately 156,600 to
203,400 daltons by sodium dodecyl sulfate polyacrylamide gel electrophoresis
(SDS-PAGE). This enzyme is described in U.S. Pat. No. 5,110,735.

Brief Summary Text (7):

A novel cellulase enzyme, known as the E1 endoglucanase, also secreted by
Acidothermus cellulolyticus into the growth medium, is described in detail in U.S.
Pat. No. 5,275,944. In its native form, this endoglucanase demonstrates a
temperature optimum of 83° C. and a specific activity of 40 μmole glucose
release from carboxymethylcellulose/min/mg protein. This E1 endoglucanase was
further identified as having an isoelectric pH of 6.7. It is this E1 endoglucanase
which has been modified and made the subject of this patent application. The E1
endoglucanase is a multidomain cellulase having a catalytic domain and a cellulose
binding domain connected to the catalytic domain by a linker peptide.

Detailed Description Text (15):

P. pastoris has been shown to be a useful host organism for expression of large
quantities of diverse heterologous proteins. P. pastoris was used to express large
quantities of active full size E1.

Detailed Description Text (39):

Mutagenized DNA was transformed into E. coli strain ES1301. Transformants were
screened for resistance to ampicillin and sensitivity to tetracycline in order to
identify clones carrying the putatively mutagenized E1 gene. Many
ampicillin-resistant candidate clones were subsequently screened on plates containing
1 mM 4-methylumbelliferyl-β-D-cellobioside (MUC) to verify expression of active E1.
Plasmid DNA was prepared from several clones and employed as templates in dideoxy
DNA sequencing reactions using the Sequenase® kit (U.S. Biochemical, Cleveland,
Ohio) to verify the sequence of E1 DNA in the region of the intended mutation. The
mutated sequence was detected in every clone which was sequenced. One of these
clones was selected and designated pYCC101. Each of the successfully mutated clones
expresses a protein which is not present in control cells and which migrates at a
molecular weight of approximately 42 kDa in SDS-PAGE gels. This 42 kDa protein also
reacts with a monoclonal antibody specific for the E1 endoglucanase on Western
blots, thus confirming its identity as E1 CAT.

Detailed Description Text (42):

Calorimetric studies of the denaturation of the full size E1 enzyme and the
proteolytically cleaved E1 CAT were carried out at pH 5.0 in 50 mM sodium acetate,
using a Microcal MC-2 differential scanning microcalorimeter over a temperature
range of 25° C. to 95° C. and using a scan rate of 20° C./h. For the
examples shown in FIG. 2, the protein concentrations were 0.24 mg/mL for the native
E1 enzyme and 0.14 mg/mL for E1 CAT.
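
As a small worked illustration of the scan parameters quoted above, the following sketch computes how long a single calorimeter scan would take. It assumes a simple linear temperature ramp with no equilibration or hold periods; the function name scan_duration_hours is hypothetical and is not taken from the patent or from any instrument software.

```python
def scan_duration_hours(start_c: float, end_c: float, rate_c_per_hour: float) -> float:
    """Duration of a linear temperature ramp (assumption: no hold or equilibration time)."""
    if rate_c_per_hour <= 0:
        raise ValueError("scan rate must be positive")
    return (end_c - start_c) / rate_c_per_hour


# Values quoted in the text: 25 to 95 degrees C at 20 degrees C per hour.
duration = scan_duration_hours(25.0, 95.0, 20.0)
print(f"{duration:.1f} h per scan")
```

Under these assumptions each scan covers a 70 degree span, so the ramp itself takes a few hours per sample.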

Detailed Description Paragraph Table (1):

SEQUENCE LISTING

(1) GENERAL INFORMATION:
  (iii) NUMBER OF SEQUENCES: 12

(2) INFORMATION FOR SEQ ID NO: 1:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 30 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (ix) FEATURE:
    (A) NAME/KEY: E1-f primer
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
CTCGAGAAAAGAGCGGGCGGCGGCTATTGG 30

(2) INFORMATION FOR SEQ ID NO: 2:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Protein
  (ix) FEATURE:
    (A) NAME/KEY: E1-f primer
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
LeuGluLysArgAlaGlyGlyGlyTyrTrp 5 10

(2) INFORMATION FOR SEQ ID NO: 3:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 24 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (ix) FEATURE:
    (A) NAME/KEY: E1r
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
CCTAGGTTAACTTGCTGCGCAGGC 24

(2) INFORMATION FOR SEQ ID NO: 4:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 5 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Protein
  (ix) FEATURE:
    (A) NAME/KEY: E1r
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
SerAlaAlaCysAla

(2) INFORMATION FOR SEQ ID NO: 5:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 36 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ATTTTCGATCCTGTCTAATGATCTGCATCGCCTAGC 36

(2) INFORMATION FOR SEQ ID NO: 6:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Protein
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
SerSerIlePheAspProValGlyAlaSerAlaSerProSerSerGln 5 10 15

(2) INFORMATION FOR SEQ ID NO: 7:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 48 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
TCGTCGATTTTCGATCCTGTCGGCGCGTCTGCATCGCCTAGCAGTCAA 48

(2) INFORMATION FOR SEQ ID NO: 8:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 36 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (ix) FEATURE:
    (A) NAME/KEY: mutagenic oligo
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
ATTTTCGATCCTGTCTAATGATCTGCATCGCCTAGC 36

(2) INFORMATION FOR SEQ ID NO: 9:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 48 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (ix) FEATURE:
    (A) NAME/KEY: mutated DNA
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
TCGTCGATTTTCGATCCTGTCTAATGATCTGCATCGCCTAGCAGTCAA 48

(2) INFORMATION FOR SEQ ID NO: 10:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 16 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Protein
  (ix) FEATURE:
    (A) NAME/KEY: Mutated amino acid
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
SerSerIlePheAspProValXaaXaaSerAlaSerProSerSerGln 5 10 15

(2) INFORMATION FOR SEQ ID NO: 11:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 358 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: Protein
  (ix) FEATURE:
    (A) NAME/KEY: E1-CAT
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
AlaGlyGlyGlyTyrTrpHisThrSerGlyArgGluIleLeuAspAla 1 5 10 15
AsnAsnValProValArgIleAlaGlyIleAsnTrpPheGlyPheGlu 20 25 30
ThrCysAsnTyrValValHisGlyLeuTrpSerArgAspTyrArgSer 35 40 45
MetLeuAspGlnIleLysSerLeuGlyTyrAsnThrIleArgLeuPro 50 55 60
TyrSerAspAspIleLeuLysProGlyThrMetProAsnSerIleAsn 65 70 75 80
PheTyrGlnMetAsnGlnAspLeuGlnGlyLeuThrSerLeuGlnVal 85 90 95
MetAspLysIleValAlaTyrAlaGlyGlnIleGlyLeuArgIleIle 100 105 110
LeuAspArgHisArgProAspCysSerGlyGlnSerAlaLeuTrpTyr 115 120 125
ThrSerSerValSerGluAlaThrTrpIleSerAspLeuGlnAlaLeu 130 135 140
AlaGlnArgTyrLysGlyAsnProThrValValGlyPheAspLeuHis 145 150 155 160
AsnGluProHisAspProAlaCysTrpGlyCysGlyAspProSerIle 165 170 175
AspTrpArgLeuAlaAlaGluArgAlaGlyAsnAlaValLeuSerVal 180 185 190
AsnProAsnLeuLeuIlePheValGluGlyValGlnSerTyrAsnGly 195 200 205
AspSerTyrTrpTrpGlyGlyAsnLeuGlnGlyAlaGlyGlnTyrPro 210 215 220
ValValLeuAsnValProAsnArgLeuValTyrSerAlaHisAspTyr 225 230 235 240
AlaThrSerValTyrProGlnThrTrpPheSerAspProThrPhePro 245 250 255
AsnAsnMetProGlyIleTrpAsnLysAsnTrpGlyTyrLeuPheAsn 260 265 270
GlnAsnIleAlaProValTrpLeuGlyGluPheGlyThrThrLeuGln 275 280 285
SerThrThrAspGlnThrTrpLeuLysThrLeuValGlnTyrLeuArg 290 295 300
ProThrAlaGlnTyrGlyAlaAspSerPheGlnTrpThrPheTrpSer 305 310 315 320
TrpAsnProAspSerGlyAspThrGlyGlyIleLeuLysAspAspTrp 325 330 335
GlnThrValAspThrValLysAspGlyTyrLeuAlaProIleLysSer 340 345 350
SerIlePheAspProVal 355 358

(2) INFORMATION FOR SEQ ID NO: 12:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 2293 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: DNA
  (ix) FEATURE:
    (A) NAME/KEY: E1-CAT
  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

GGATCCACGTTGTACAAGGTCACCTGTCCGTCGTTCTGGTAGAGCGGCGGGATGGTCACC 60
CGCACGATCTCTCCTTTGTTGATGTCGACGGTCACGTGGTTACGGTTTGCCTCGGCCGCG 120
ATTTTCGCGCTCGGGCTTGCTCCGGCTGTCGGGTTCGGTTTGGCGTGGTGTGCGGAGCAC 180
GCCGAGGCGATCCCAATGAGGGCAAGGGCAAGAGCGGAGCCGATGGCACGTCGGGTGGCC 240
GATGGGGTACGCCGATGGGGCGTGGCGTCCCCGCCGCGGACAGAACCGGATGCGGAATAG 300
GTCACGGTGCGACATGTTGCCGTACCGCGGACCCGGATGACAAGGGTGGGTGCGCGGGTC 360
GCCTGTGAGCTGCCGGCTGGCGTCTGGATCATGGGAACGATCCCACCATTCCCCGCAATC 420
GACGCGATCGGGAGCAGGGCGGCGCGAGCCGGACCGTGTGGTCGAGCCGGACGATTCGCC 480
CATACGGTGCTGCAATGCCCAGCGCCATGTTGTCAATCCGCCAAATGCAGCAATGCACAC 540
ATGGACAGGGATTGTGACTCTGAGTAATGATTGGATTGCCTTCTTGCCGCCTACGCGTTA 600
CGCAGAGTAGGCGACTGTATGCGGTAGGTTGGCGCTCCAGCCGTGGGCTGGACATGCCTG 660
CTGCGAACTCTTGACACGTCTGGTTGAACGCGCAATACTCCCAACACCGATGGGATCGTT 720
CCCATAAGTTTCCGTCTCACAACAGAATCGGTGCGCCCTCATGATCAACGTGAAAGGAGT 780
ACGGGGGAGAACAGACGGGGGAGAAACCAACGGGGGATTGGCGGTGCCGCGCGCATTGCG 840
GCGAGTGCCTGGCTCGCGGGTGATGCTGCGGGTCGGCGTCGTCGTCGCGGTGCTGGCATT 900
GGTTGCCGCACTCGCCAACCTAGCCGTGCCGCGGCCGGCTCGCGCCGCGGGCGGCGGCTA 960
TTGGCACACGAGCGGCCGGGAGATCCTGGACGCGAACAACGTGCCGGTACGGATCGCCGG 1020
CATCAACTGGTTTGGGTTCGAAACCTGCAATTACGTCGTGCACGGTCTCTGGTCACGCGA 1080
CTACCGCAGCATGCTCGACCAGATAAAGTCGCTCGGCTACAACACAATCCGGCTGCCGTA 1140
CTCTGACGACATTCTCAAGCCGGGCACCATGCCGAACAGCATCAATTTTTACCAGATGAA 1200
TCAGGACCTGCAGGGTCTGACGTCCTTGCAGGTCATGGACAAAATCGTCGCGTACGCCGG 1260
TCAGATCGGCCTGCGCATCATTCTTGACCGCCACCGACCGGATTGCAGCGGGCAGTCGGC 1320
GCTGTGGTACACGAGCAGCGTCTCGGAGGCTACGTGGATTTCCGACCTGCAAGCGCTGGC 1380
GCAGCGCTACAAGGGAAACCCGACGGTCGTCGGCTTTGACTTGCACAACGAGCCGCATGA 1440
CCCGGCCTGCTGGGGCTGCGGCGATCCGAGCATCGACTGGCGATTGGCCGCCGAGCGGGC 1500
CGGAAACGCCGTGCTCTCGGTGAATCCGAACCTGCTCATTTTCGTCGAAGGTGTGCAGAG 1560
CTACAACGGAGACTCCTACTGGTGGGGCGGCAACCTGCAAGGAGCCGGCCAGTACCCGGT 1620
CGTGCTGAACGTGCCGAACCGCCTGGTGTACTCGGCGCACGACTACGCGACGAGCGTCTA 1680
CCCGCAGACGTGGTTCAGCGATCCGACCTTCCCCAACAACATGCCCGGCATCTGGAACAA 1740
GAACTGGGGATACCTCTTCAATCAGAACATTGCACCGGTATGGCTGGGCGAATTCGGTAC 1800
GACACTGCAATCCACGACCGACCAGACGTGGCTGAAGACGCTCGTCCAGTACCTACGGCC 1860
GACCGCGCAATACGGTGCGGACAGCTTCCAGTGGACCTTCTGGTCCTGGAACCCCGATTC 1920
CGGCGACACAGGAGGAATTCTCAAGGATGACTGGCAGACGGTCGACACAGTAAAAGACGG 1980
CTATCTCGCGCCGATCAAGTCGTCGATTTTCGATCCTGTCTAATGATCTGCATCGCCTAG 2040
CAGTCAACCGTCCCCGTCGGTGTCGCCGTCTCCGTCGCCGAGCCCGTCGGCGAGTCGGAC 2100
GCCGACGCCTACTCCGACGCCGACAGCCAGCCCGACGCCAACGCTGACCCCTACTGCTAC 2160
GCCCACGCCCACGGCAAGCCCGACGCCGTCACCGACGGCAGCCTCCGGAGCCCGCTGCAC 2220
CGCGAGTTACCAGGTCAACAGCGATTGGGGCAATGGCTTCACGGTAACGGTGGCCGTGAC 2280
AAATTCCGGATCC 2293
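
The relationship between the native and mutated sequences in the listing (SEQ ID NOS: 6, 7, 9 and 10) can be checked with a short translation sketch. The code below is illustrative only and is not part of the patent. It assumes the standard genetic code, reads each sequence in frame from its first base, and writes stop codons as Xaa to mirror how SEQ ID NO: 10 displays those positions. The names CODON_TABLE, THREE_LETTER and translate are hypothetical helpers introduced here.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (third base varies fastest).
BASES = "TCAG"
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), ONE_LETTER)}

# Three-letter residue names; stop codons are shown as Xaa (assumption: this mirrors
# the notation used for the mutated positions in SEQ ID NO: 10).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Xaa",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into fused three-letter residue names."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3))


SEQ_ID_1 = "CTCGAGAAAAGAGCGGGCGGCGGCTATTGG"                    # E1-f primer
SEQ_ID_7 = "TCGTCGATTTTCGATCCTGTCGGCGCGTCTGCATCGCCTAGCAGTCAA"  # native region
SEQ_ID_9 = "TCGTCGATTTTCGATCCTGTCTAATGATCTGCATCGCCTAGCAGTCAA"  # mutated region

primer_peptide = translate(SEQ_ID_1)
native_peptide = translate(SEQ_ID_7)
mutated_peptide = translate(SEQ_ID_9)

# Codons in the mutated region that the standard code reads as stop signals.
stop_codons = [SEQ_ID_9[i:i + 3] for i in range(0, len(SEQ_ID_9), 3)
               if CODON_TABLE[SEQ_ID_9[i:i + 3]] == "*"]

print(primer_peptide)
print(native_peptide)
print(mutated_peptide)
print(stop_codons)
```

Read this way, the two sequences differ only at the eighth and ninth codons, where GGC GCG (Gly Ala) in SEQ ID NO: 7 becomes TAA TGA in SEQ ID NO: 9. This is consistent with the Xaa Xaa positions in SEQ ID NO: 10 and with SEQ ID NO: 11 ending at the preceding Val.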



CLAIMS:

9. The DNA according to claim 7 wherein the DNA encodes a protein having an 
endoglucanase activity. 

13. The DNA according to claim 2 wherein the DNA encodes a protein having an 
endoglucanase activity. 